Group 26

Carl Felix Freiesleben - s203521
Emilie Munk - s203538
Josefine Løken - s183784
Judith Tierno Martí - s222869
Sahand Yazdani - s203538

Intro

The analysis was performed on the Right Heart Catheterization (RHC) dataset, first analysed by Connors et al. (1996).

  • The original study focuses on the effect RHC has on patients

  • The authors used propensity score matching to create an artificial control group

  • Their study found that patients undergoing RHC experienced shorter survival times

  • Attribute data includes patient demographics, socioeconomic details, physiological parameters, disease-related information, and survival outcomes

Materials and Methods

We performed our analysis using the Tidyverse.

Before cleaning and augmentation:

  • 5735 patients

  • 62 attributes

After cleaning and augmentation:

  • 5612 patients

  • 53 attributes

Materials and Methods

We familiarized ourselves with the data by extracting summary information about the attributes and making numerous plots

We created summaries of different attributes to decide which ones were worth analysing

We used histograms because they are easy to read and interpret, while also showing a lot of information

Table 1

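library(tidyverse)
library(table1)
# Recode the variables as factors; death gets readable labels for the table columns
rhc_aug |>
  mutate(sex = factor(sex),
         swang1 = factor(swang1),
         death = factor(death, levels = c(0, 1), labels = c("Alive", "Dead"))) |>
  table1(x = ~ sex + age + race + swang1 | death,
         data = _)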
                      Alive              Dead               Overall
                      (N=1972)           (N=3640)           (N=5612)
sex
  Female              906 (45.9%)        1594 (43.8%)       2500 (44.5%)
  Male                1066 (54.1%)       2046 (56.2%)       3112 (55.5%)
age
  Mean (SD)           56.6 (17.4)        64.0 (15.7)        61.4 (16.7)
  Median [Min, Max]   58.0 [18.0, 102]   66.0 [18.0, 101]   64.0 [18.0, 102]
race
  black               323 (16.4%)        577 (15.9%)        900 (16.0%)
  other               121 (6.1%)         223 (6.1%)         344 (6.1%)
  white               1528 (77.5%)       2840 (78.0%)       4368 (77.8%)
swang1
  0                   1291 (65.5%)       2177 (59.8%)       3468 (61.8%)
  1                   681 (34.5%)        1463 (40.2%)       2144 (38.2%)

Investigating the mean blood pressure with different diseases

  • Bimodal distributions
  • Mean blood pressure appears to be higher in patients without RHC
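
A faceted histogram is one way to inspect the shape of the blood pressure distribution per treatment group. The sketch below uses simulated data only; the column names `meanbp1` and `swang1` and the group labels are assumptions, and the two simulated modes are chosen purely for illustration.

```r
library(ggplot2)

# Simulated stand-in for the cleaned data (NOT the real RHC values).
# `meanbp1` and `swang1` are assumed column names.
set.seed(1)
toy <- data.frame(
  meanbp1 = c(rnorm(200, mean = 60, sd = 10), rnorm(200, mean = 120, sd = 15)),
  swang1  = factor(rep(c("No RHC", "RHC"), times = 200))
)

# One histogram panel per treatment group
p <- ggplot(toy, aes(x = meanbp1)) +
  geom_histogram(binwidth = 5) +
  facet_wrap(~ swang1) +
  labs(x = "Mean blood pressure", y = "Number of patients")
p
```

A fixed `binwidth` keeps the panels comparable; with bins that are too wide, the two modes can merge into one.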

How Your Medical Insurance Influences Your Survival Chances

  • Most patients have low income
  • Patients are mostly covered by Medicare or private medical insurance
  • Individuals in lower income categories have a higher mortality
  • Individuals covered by Medicare have the highest mortality
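
Mortality per insurance class, as compared above, can be computed as a grouped proportion. This is a minimal sketch on a six-row toy table, not the real data; the column names `ninsclas` and `death` (coded 0/1) are assumptions.

```r
library(dplyr)

# Toy data for illustration only; `ninsclas` and `death` are assumed names
toy <- tibble(
  ninsclas = c("Medicare", "Medicare", "Medicare", "Private", "Private", "Medicaid"),
  death    = c(1, 1, 0, 0, 1, 0)
)

# Share of deceased patients per insurance class, highest first
mortality_by_insurance <- toy |>
  group_by(ninsclas) |>
  summarise(n = n(), mortality = mean(death)) |>
  arrange(desc(mortality))

mortality_by_insurance
```

Reporting `n` next to the proportion helps when some classes contain few patients. Such a comparison is unadjusted, so differences in age between the insurance classes may explain part of any gap.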

PCA

Modelling

# A tibble: 10 × 4
   Diagnosis          Coefficient Intercept   p.value
   <chr>                    <dbl>     <dbl>     <dbl>
 1 multiple diagnosis     0.0180     -0.163 0.359    
 2 seps                   0.0292     -1.30  0.00222  
 3 card                   0.0257     -0.754 0.0000221
 4 resp                   0.0240     -0.871 0.0000290
 5 renal                  0.00959     0.429 0.684    
 6 hema                   0.0191      0.185 0.863    
 7 gastr                  0.0272     -0.771 0.00937  
 8 trauma                 0.0170     -1.56  0.147    
 9 neuro                  0.0183     -0.223 0.358    
10 meta                   0.0156     -0.552 0.468    
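
A table of this shape could come from fitting one model per diagnosis. The sketch below assumes a logistic regression of `death` on an APS column named `aps1`, fitted separately for each diagnosis, and assumes the reported p-value is that of the slope; none of this is confirmed by the slides, and the data are simulated.

```r
# Simulated data for illustration only; `diagnosis`, `aps1` and `death`
# are assumed column names, not taken from the real dataset.
set.seed(42)
n <- 600
toy <- data.frame(
  diagnosis = rep(c("card", "resp", "seps"), each = n / 3),
  aps1      = round(runif(n, min = 10, max = 120))
)
# Simulate death with log-odds that rise with the score
toy$death <- rbinom(n, size = 1, prob = plogis(-1 + 0.025 * toy$aps1))

# Fit death ~ aps1 for one diagnosis and collect slope, intercept and p-value
fit_one <- function(d) {
  fit <- glm(death ~ aps1, family = binomial(), data = d)
  est <- summary(fit)$coefficients
  data.frame(
    Diagnosis   = d$diagnosis[1],
    Coefficient = est["aps1", "Estimate"],
    Intercept   = est["(Intercept)", "Estimate"],
    p.value     = est["aps1", "Pr(>|z|)"]  # assumed: p-value of the slope
  )
}

results <- do.call(rbind, lapply(split(toy, toy$diagnosis), fit_one))

# Odds ratio for a 10-point increase in the score
results$odds_ratio_per_10 <- exp(10 * results$Coefficient)
results
```

Under the same logistic assumption, the `card` coefficient of 0.0257 in the table would correspond to an odds ratio of exp(0.257) ≈ 1.29 for every 10 additional APS points.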

Discussion

How come we found no major discoveries?

What could have been done differently?

Conclusion

We can conclude that PCA can make sense for further analysis.

We can conclude that, for several diagnoses (seps, card, resp and gastr, all with p < 0.05), high APS values are associated with an increased risk of death.

Sources: https://hbiostat.org/data/repo/rhc

Download: https://hbiostat.org/data/repo/rhc.csv